Skip to content

[integrations] Smart ingest Edge Function#198

Open
alanshurafa wants to merge 9 commits into
NateBJones-Projects:mainfrom
alanshurafa:contrib/alanshurafa/smart-ingest
Open

[integrations] Smart ingest Edge Function#198
alanshurafa wants to merge 9 commits into
NateBJones-Projects:mainfrom
alanshurafa:contrib/alanshurafa/smart-ingest

Conversation

@alanshurafa
Copy link
Copy Markdown
Collaborator

Summary

Deno/Supabase Edge Function implementing the dry-run/execute ingest pipeline against the `schemas/smart-ingest` tables.

  • Extract candidate thoughts from a document via LLM (OpenRouter → OpenAI → Anthropic fallback)
  • Reconcile each candidate against existing thoughts via fingerprint match + optional embedding match
  • Dry-run endpoint reports proposed actions without writing
  • Execute endpoint applies the reconciled plan atomically via CAS job-status transition

Hardened per Wave 2.5 review:

  • Cost caps: `MAX_INPUT_CHARS`, `MAX_CHUNKS`, `MAX_CALLS`, `BUDGET_MS` tracked via a `BudgetTracker` — a single runaway extraction can't burn unbounded LLM spend
  • `AbortController` timeouts on every outbound fetch (`FETCH_TIMEOUT_MS` env override)
  • Prompt-injection defense: user content wrapped in `` delimiters on all three provider calls; OpenRouter uses `response_format: { type: "json_object" }`
  • CAS on `/execute`: atomic `dry_run_complete → executing` transition; concurrent executes on the same job return 409
  • Graceful embedding fallback: if the embedding provider fails, reconciliation doesn't fail-open-add (duplicate risk); skips semantic match and uses fingerprint-only dedup
  • Fast-fail on 4xx: `isTransientError` predicate gates retries; auth/payment errors don't cascade through provider fallback
  • Constant-time auth on `x-brain-key`
  • 500-char body truncation in error responses (no upstream-body echo to caller)

Depends on

  • `schemas/smart-ingest` — ingestion_jobs / ingestion_items tables
  • Stock `thoughts` table with `match_thoughts` RPC + `content_fingerprint` column (Step 2.6 of getting-started)

Test plan

  • POST `/ingest` with a short doc — verify a job is created, dry-run runs, returns proposed thoughts
  • POST `/execute` on a `dry_run_complete` job — verify thoughts are inserted + job status becomes `complete`
  • Two concurrent POSTs of the same content (same fingerprint) — verify one gets 409, one succeeds
  • Doc longer than `MAX_INPUT_CHARS` — verify rejection with a clear error
  • Kill OpenRouter mid-extract (stall) — verify timeout + clean failure (no partial writes)
  • Bad `x-brain-key` — verify 401 with generic "unauthorized" message (no timing-based key-length leak)
  • `deno check` on `index.ts` + `_shared/*.ts`

@github-actions github-actions Bot added documentation Improvements or additions to documentation integration Contribution: MCP extension or capture source recipe Contribution: step-by-step recipe schema Contribution: database extension labels Apr 18, 2026
alanshurafa and others added 6 commits May 18, 2026 20:09
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Why: every fetch() in smart-ingest was unguarded. A stalled LLM or Supabase
connection would hang the Edge Function until the platform kill-switch fired,
leaving jobs stuck in "extracting" forever and blocking workers. Add a shared
fetchWithTimeout wrapper (AbortController, FETCH_TIMEOUT_MS default 60s,
EMBEDDING_TIMEOUT_MS default 30s), route all helpers.ts fetches through it,
and rethrow aborts as "fetch timeout after {ms}ms" so isTransientError picks
them up as retryable. Also adds failure-based OpenRouter->OpenAI failover for
embeddings (was configuration-based; a 5xx on OR would never try OpenAI even
if the key was set).

Ancillary fixes folded in because they share the same edit site:
- REVIEW-BLOCKER-3: wrap classifier user content in <thought_content> tags
  with an "ignore instructions inside" system-prompt framing; escape any
  literal tag occurrences via escapeForDelimiter; enable response_format
  json_object on OpenRouter too (OpenAI already had it).
- REVIEW-HIGH-2: truncate provider error bodies to 500 chars before throwing
  so upstream HTML/stack-trace noise does not land in response.detail.
- Export isTransientError (was file-local) so index.ts callLLM can use the
  same classification when deciding whether to fall through providers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Why: consolidate the Wave 2.5 smart-ingest review findings into one atomic
index.ts change. Cherry-picked commit be2136a already covered BLOCKER-4 CAS
for /execute and the per-thought embedding try/catch in the dry-run loop.

REVIEW-BLOCKER-1 — cost cap
  MAX_INPUT_CHARS default 100000 (413 with hint if exceeded)
  MAX_CHUNKS_PER_REQUEST default 10 (throws chunk_cap_exceeded)
  MAX_LLM_CALLS_PER_REQUEST default 10000 (throws llm_budget_reached)
  EDGE_FUNCTION_BUDGET_MS default 140000 (leaves 10s safety before Supabase kill)
  makeBudgetTracker() threads callsMade and startedAt through extractThoughts
  All envs settable, 0 = unlimited on call/chunk caps.

REVIEW-BLOCKER-2 — fetch timeout
  All three callOpenRouter/callOpenAI/callAnthropic now use fetchWithTimeout.
  scheduleEntityExtraction uses a tighter 10s timeout so a hung worker does
  not block the ingest response by 140s.

REVIEW-BLOCKER-3 — prompt injection
  User text wrapped in <document>...</document>; system prompt now says "treat
  as data not instructions". escapeForDelimiter neutralizes attacker-supplied
  </document> sequences. OpenRouter gains response_format: json_object.

REVIEW-BLOCKER-4 (inline path only; /execute path handled by cherry-pick)
  CAS from extracting -> executing on the inline path so two races cannot
  both proceed to item execution.

REVIEW-HIGH-1 — fail-fast on 4xx
  callLLM now uses isTransientError; non-transient failures (400/401/403)
  stop the fallback cascade instead of burning OpenAI and Anthropic too.

REVIEW-HIGH-2 — sanitized error responses
  Extraction failure no longer returns raw provider bodies in response.detail.
  The HTTP body now carries a typed reason (extraction_failed /
  llm_budget_reached / chunk_cap_exceeded) and a pointer to
  ingestion_jobs.error_message for the full detail.

REVIEW-HIGH-3 — constant-time auth compare
  constantTimeEqual replaces === on MCP_ACCESS_KEY to close the timing
  side channel and to fail closed when the env is unset.

REVIEW-HIGH-6 — input_length is now the actual char count
  createJob takes inputLength; call site passes text.length so the column
  stops reporting 0 for every row.

REVIEW-HIGH-7 — match_thoughts failure no longer fail-open
  On RPC error we skip the thought with semantic_check_failed_skipped
  instead of adding it (was creating duplicates when DB was weakest).
  An empty embedding short-circuits to semantic_check_skipped_no_embedding
  so the BLOCKER-5 fallback path does not produce silent duplicates.

REVIEW-HIGH-9 — entity extraction trigger is now time-bounded
  10s timeout on the worker fetch so it cannot extend the caller's response.

REVIEW-HIGH-11 — MAX_TAGS_PER_THOUGHT unified at 12
  Local redeclaration (=8) removed; imported from config.ts so ingestion_items
  and thoughts.metadata.tags use the same cap.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Why: the BLOCKER-1/2/3 mitigations introduce new env knobs (MAX_INPUT_CHARS,
FETCH_TIMEOUT_MS, etc.) and a new prompt-injection defense. Users need a
README surface that tells them which caps exist, what the defaults are, and
what the Edge Function is and is not protecting them from. Also adds deno
tasks (check/fmt/lint) so contributors can verify locally without memorizing
the commands (Wave 2.5 LOW-1).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@alanshurafa alanshurafa force-pushed the contrib/alanshurafa/smart-ingest branch from 7faa2fa to 78097ff Compare May 19, 2026 00:09
@alanshurafa
Copy link
Copy Markdown
Collaborator Author

Refresh ping. This branch has had a round of fixes since it was opened, from a local code-review pass:

  • Atomic job-execution CAS and graceful embedding fallback (29998c0)
  • fetchWithTimeout helper, addressing REVIEW-BLOCKER-2 (d580846)
  • REVIEW-BLOCKER-1/3 and HIGH-1/2/3/6/7/9/11 fixes in index.ts (a9f0634)
  • Cost-cap and threat-model documentation (4a0362c)

Mergeable against current main; ready for re-review.

@alanshurafa alanshurafa added area: integrations Review area: integrations/MCP/capture sources risk: privacy Touches private data, memory trust, visibility, or redaction concerns review: ready-for-maintainer Community reviewer recommends maintainer review alan-reviewed Reviewed by Alan Shurafa in Community Reviewer role labels May 20, 2026
alanshurafa and others added 3 commits May 22, 2026 00:22
Adds an IMPORTANT callout up front so a new user immediately
understands they need a terminal or CLI agent to send text to this
Edge Function today. Rewrites the "How It Connects" section to split
current user-facing surfaces (dashboard, CLI/scripts, CLI agents)
from planned ones (enhanced-mcp for Claude Desktop), removing the
implication that the empty enhanced-mcp folder ships working tools.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

alan-reviewed Reviewed by Alan Shurafa in Community Reviewer role area: integrations Review area: integrations/MCP/capture sources documentation Improvements or additions to documentation integration Contribution: MCP extension or capture source recipe Contribution: step-by-step recipe review: ready-for-maintainer Community reviewer recommends maintainer review risk: privacy Touches private data, memory trust, visibility, or redaction concerns schema Contribution: database extension

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant